(54) System and method for automatically adding informational hypertext links to received
documents

(57) In a distributed computer system, an auto- 
mated document annotation system and method adds 
hypertext cross-references to a set of known informa- 
tion sources into documents requested by a client com- 
puter in such a way that the merged document is 
displayable by existing Web browsers. The distributed 
computer network incorporates a plurality of servers to 
store documents. Each stored document has a unique 
document identifier and is viewable from a client compu- 
ter having a browser configured to request and receive 
documents over the network. An annotation proxy is a
software procedure configured to merge a
requested document from a first server with hypertext
links to documents containing associated supplemental
information. The set of hypertext links and criteria for
identifying where such links should be added to 
requested documents are defined by one or more dic- 
tionaries of cross-references. The annotation proxy then 
relays the merged document to a receiver unit that is 
selected from another proxy, such as a firewall proxy or 
another annotation overlay proxy, or the browser, which 
ultimately displays the merged document. The annota- 
tion proxy optionally includes a dictionary generator that 
generates a dictionary of references to documents 
requested by the user, each reference in the dictionary 
indicating the textual context of the hypertext link or 
links used to request the associated document. The 
generated dictionary represents information sources 
known and used by the user. The annotation proxy then 
annotates requested documents with cross-references 
in the dictionary that was generated by the annotation 
proxy. 
established dictionaries, directories, or libraries of information sources for which cross-references should be merged
into received documents. Then, when the user requests a document, that request should be relayed through the proxy, 
which merges the requested document with cross-references to the user-specified supplemental information sources. 
The resulting merged document should be viewable with any existing Web browser. 

Alternatively, the system should allow a user of the proxy to direct the proxy to generate and add to a dictionary of
cross-references annotations from sources accessed by the user over a period of time. Then, when a user requests a
document, the proxy should be able to merge cross-references in the dictionary with the requested document, eliminating
the need to search the Web for the appropriate supplemental materials.

SUMMARY OF THE INVENTION 

In summary, the present invention is a system and method for merging hypertext cross-references to a set of known 
information sources with documents requested over the Web in such a way that the merged document is displayable by 
existing Web browsers.

Specifically, the present invention provides a system and method for providing hypertext link annotations for docu- 
ments requested over a distributed computer network that incorporates a plurality of servers to store the documents. 
Each stored document has a unique document identifier and is viewable from a client computer having a browser con- 
figured to request and receive documents over the network. 

Another feature of the present invention is an annotation proxy, which is a software procedure configured to merge 
a requested document from a first server with hypertext links to documents containing associated supplemental infor- 
mation where the set of hypertext links and criteria for identifying where such links should be added to requested doc- 
uments are defined by one or more dictionaries of cross-references. The annotation proxy then relays the merged 
document to a receiver unit that is selected from another proxy (possibly a firewall proxy or another annotation overlay 
proxy) or the browser, which ultimately displays the merged document. 

In a preferred embodiment, the annotation proxy can generate a dictionary of references to documents requested
by the user, each reference in the dictionary indicating the textual context of the hypertext link or links used to request
the associated document. The generated dictionary thus represents information sources known and used by the user. 
The annotation proxy can then annotate requested documents with cross-references in the dictionary that was gener- 
ated by the annotation proxy. 

The present invention is also a method usable in the same type of computer network for providing hypertext link 
annotations for a requested document. As a first step, at least one dictionary of hypertext links to supplemental docu- 
ments is stored. A merged document is then formed by merging a requested document stored on a first server with 
hypertext link annotations from the dictionary when the text or other content in the document matches corresponding
merge criteria. This merged document is then relayed to a receiver selected from another proxy or said browser. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Examples of the invention will be described in conjunction with the drawings, in which:

FIG. 1 is a block diagram of a distributed computer system incorporating the present invention. 

FIG. 2 is a block diagram of a preferred embodiment of the present invention, showing the relationship between a
web client, a web server, and an annotation proxy server agent interposed between the web client and the web 
server. 

FIG. 3 is an illustration of an exemplary annotation directory showing the contents of a cross reference source field
and match pattern field. 

FIG. 4 is an illustration of the manner in which an annotation in the form of a hypertext link to a specified URL is 
added to a portion of a document. 

FIG. 5 is an illustration of an exemplary annotation directory of an alternative embodiment of the invention showing 
the contents of a cross reference source field, a match pattern field, and a relevance index field. 

DESCRIPTION OF THE PREFERRED EMBODIMENTS 

Referring to FIG. 1, there is shown a distributed computer system 100 having many client computers 102 and at
least one remotely located information server computer 104. In the preferred embodiment, each client computer 102 is
connected to the information server 104 via the Internet 106, although other types of communication connections could
greater detail hereinafter.

When web client 102 requests a document, such as document "Doc3" 166 stored in document storage 182 located
on web server 104b, using web browser 110, the user associated with client computer 102 also specifies an annotation
proxy server 118 and one of the annotation directories 191, 192 provided on that server. If the annotation proxy server
118 has only a single annotation directory, such as when the proxy server is resident on the client computer making the
request and the user has provided an annotation directory for use on all requested documents, then explicit specification
of the directory may be unnecessary. Furthermore, in the preferred embodiment the user may specify an annotation
proxy and set of annotation directories to be used for annotating all future document requests until the user specifies a
different annotation proxy and/or set of annotation directories.

Further, a particular annotation proxy server 118 may be specified either explicitly, by a command
from the client 102 at the time the document is requested, or implicitly, such as by using the proxy server
118 resident on the client computer as a default if no other proxy server is specified, or based on characteristics of the
requested document, user history, or other user preferences. When explicit specification of a proxy is required or
desired, the user associated with the client computer may specify a particular annotation proxy server 118 and annotation
directory by clicking one or more buttons on the client web page, or by entering an annotation proxy server identifier
(such as by entering a proxy server name or URL) and an annotation proxy directory name or URL.

A document request on the client computer 102 ultimately results in receipt of a version of the document which is
annotated with cross references in accordance with the selected annotation proxy server and annotation directory. The
specific commands generated and the command and data pathways on the network 100 will depend somewhat on the
locations of the requesting client 102, the information server 104 storing the requested document, and the annotation proxy
server 118. In particular, the command and data pathways will depend on whether the proxy server 118 is resident on
the requesting client computer 102, resident on the same information server 104 that is providing the requested
document, or provided by a separate annotation proxy computer site on the network.

In one embodiment where the annotation proxy server 118 is provided on the requesting client computer 102, the
document request command 201 (which may include a requesting client computer identifier, a unique document
identifier for the requested document, an identifier for the proxy server that will annotate the document, and an annotation
directory identifier when applicable) is routed internally to the proxy server 118, which in turn transmits a request to the
server 104 for the document using the unique document identifier and the requesting computer identifier. Information
server 104 provides the requested document to the proxy server 118, which applies the identified annotation directory
to the received document and provides the merged document to the browser 110 for viewing on the requesting client
computer 102.
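
The request flow of this embodiment can be illustrated with a minimal Python sketch. It is not taken from the specification: the class name `DocumentRequest`, its field names, the function `handle_request`, and the stub `fetch` and `annotate` callables are hypothetical, chosen only to mirror the four identifiers that command 201 may carry and the fetch-then-annotate order described above.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class DocumentRequest:
    """Hypothetical shape of document request command 201."""
    client_id: str                       # requesting client computer identifier
    doc_url: str                         # unique document identifier
    proxy_id: str                        # proxy server that will annotate
    directory_id: Optional[str] = None   # annotation directory, when applicable


def handle_request(
    request: DocumentRequest,
    fetch: Callable[[str, str], str],
    annotate: Callable[[str, Optional[str]], str],
) -> str:
    """Proxy side: obtain the document from the server, then merge annotations."""
    original = fetch(request.doc_url, request.client_id)
    return annotate(original, request.directory_id)


# Stand-ins for the information server and the merge step (assumptions).
def fake_fetch(doc_url: str, client_id: str) -> str:
    return f"contents of {doc_url}"


def fake_annotate(text: str, directory_id: Optional[str]) -> str:
    return f"[{directory_id}] {text}"


req = DocumentRequest("client-102", "http://example.com/Doc3", "proxy-118", "dir-191")
merged = handle_request(req, fake_fetch, fake_annotate)
```

Passing the fetch and annotate steps in as callables keeps the sketch independent of where the proxy actually resides (client, information server, or a separate site).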

Once the request for the document is received and recognized by the web server on which the requested document is
stored, the web server prepares the document and transmits the document to the annotation proxy server 118 (which
may be the same or a different computer from the requesting client computer) for annotation. If the annotation is
performed on a remote proxy server 118, then annotation is performed prior to transmission of the document to the client
102, in a conventional manner.

In a different embodiment, the requesting computer may receive the unannotated document, retransmit it to any
desired annotation proxy server and then receive the annotated document back from the proxy server after annotation.
Although such a system and method are operable, they are less efficient.

The manner of annotating a document is now described with reference to FIG. 3. The annotation proxy server 118
includes a set of hypertext linking rules or document merger procedures 122 for adding annotations, such as in the form
of hypertext links, to a requested document. In simplest terms, the annotation proxy server parses the requested
document and compares the characters, words, phrases, and the like with match patterns 195 in the selected annotation
directory. Various search strategies and search engines for performing such comparisons are known in the art and are
not discussed further. When a pattern identified in the designated annotation directory 191, 192 is present in the
requested document, an annotation is performed by adding to the requested document one or more cross references to
the document associated with the identified pattern.
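
As a minimal sketch of this merge step, the Python fragment below treats an annotation directory as a list of (cross reference URL, match pattern) pairs and wraps each matching span of plain text in an HTML link. The function name `annotate`, the literal, case-insensitive matching, and the example URL are assumptions made for illustration; the specification leaves the search strategy open.

```python
import re
from html import escape


def annotate(text, directory):
    """Wrap every occurrence of a match pattern in a link to its cross reference.

    `directory` is a list of (xref_url, pattern) pairs. Patterns are matched
    literally and case-insensitively (an assumption of this sketch).
    """
    if not directory:
        return text
    # One alternation with a capture group per entry, so the document is
    # scanned once and inserted markup is never itself re-matched.
    combined = re.compile(
        "|".join(f"({re.escape(pattern)})" for _, pattern in directory),
        re.IGNORECASE,
    )

    def link(match):
        xref_url = directory[match.lastindex - 1][0]
        return f'<a href="{escape(xref_url, quote=True)}">{match.group(0)}</a>'

    return combined.sub(link, text)


example_directory = [("http://example.com/fft", "Fourier transform")]
annotated = annotate("The fast Fourier transform is used.", example_directory)
```

Scanning once with a combined pattern avoids a pitfall of applying the entries one after another, where a later pattern could match text inside a link added for an earlier one.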

For example, with reference to FIG. 2, two exemplary annotation directories 191, 192 are shown. Each annotation
directory 191, 192 includes a plurality of paired entries (e.g. 191a, 191b, 191c, 191d, 191e; and 192a, 192b, 192c,
192d) where each paired entry includes a cross reference document source field 194 and a match pattern field 195.
Each cross-reference source field 194 identifies the unique location of a cross reference document, and each match
pattern field 195 defines a character pattern (including symbols, words, characters, phrases, numbers, and the like) that
defines where annotation hyperlinks to the cross reference document should be added to requested documents.

In reference to FIG. 3, there is shown a more specific example of entries in an annotation directory. Here, the entry
URLX1 corresponds to the generic entry Xref Source 1, and the entry "music synthes*" w/10 "signal process*"
corresponds to the generic entry match pattern 1 of annotation directory 191 of FIG. 2. The "*" in the match pattern indicates
a so-called "wild card" character or characters, which stands for no characters or one or more characters at that position
in the text. Use of such wild card characters is known in conventional search techniques and is not discussed further. In
this example, whenever the text string "music synthes*" appears within 10 words of the text string "signal process*" in
may appear in red, whereas text linked with relevance index RI=2 may appear in green.
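
The proximity match pattern described with reference to FIG. 3 (two phrases ending in a wild card, required to appear within ten words of each other) can be sketched in Python as follows. The function names, the word tokenization, and the choice to measure distance between the first words of the two phrases are assumptions of this sketch, not details given in the specification.

```python
import re


def _phrase_positions(words, phrase):
    """Return word indexes where `phrase` starts; a trailing '*' matches any suffix."""
    terms = phrase.lower().split()

    def term_matches(term, word):
        if term.endswith("*"):
            return word.startswith(term[:-1])
        return word == term

    hits = []
    for i in range(len(words) - len(terms) + 1):
        if all(term_matches(t, words[i + k]) for k, t in enumerate(terms)):
            hits.append(i)
    return hits


def within(text, first, second, max_words):
    """True if `first` and `second` both occur, starting at most `max_words` words apart."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    starts_a = _phrase_positions(words, first)
    starts_b = _phrase_positions(words, second)
    return any(abs(i - j) <= max_words for i in starts_a for j in starts_b)


sample = "A music synthesizer relies on digital signal processing hardware."
found = within(sample, "music synthes*", "signal process*", 10)
```

A document for which `within` returns true would be a candidate for a link to the cross reference source paired with that pattern.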

In embodiments of the invention where the annotation proxy server 118 is resident on the web information server
104 which provided the requested document, the annotation and merging of the original document with the annotations
to generate a hypertext link annotated document may occur prior to transmission of the document to the client 102. If
the annotation proxy server 118 is resident on a different web information server site than the server which provided the
requested document or the client computer 102 which requested the document, then the original document is
transmitted to the remote APS 118 for annotation to generate a hypertext link annotated document, which is then transmitted to
the client 102.

Table 1 sets forth a pseudocode representation of the Annotation Proxy Procedure. The Annotation Proxy Procedure
may include or invoke one or more of three subprocedures: (1) an Install Cross-Reference Directory subprocedure, (2)
an Uninstall Cross-Reference Directory subprocedure, and (3) a Request and Merge Document subprocedure.

The Install Cross-Reference Directory subprocedure is responsible for retrieving and adding a document (DocURL)
to the set of dictionaries (directories) used by the Annotation Proxy Procedure. The Uninstall Cross-Reference Directory
subprocedure is responsible for deleting the appropriate installed directories depending upon the value of the DocURL
parameter in the subprocedure call. If DocURL = ~, then all of the installed directories are deleted; otherwise, only the
directory specified by the DocURL parameter is deleted.

The Request and Merge Document (DocURL) subprocedure is responsible for requesting and receiving the document
specified by the DocURL parameter in the subprocedure call. For all items in all installed cross-reference directories,
the subprocedure locates all text matching a specified pattern and inserts (annotates) a cross-reference to the
corresponding document. It then sends the merged document to the requester, where the requester may be the client or
may be another proxy.
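
The three subprocedures can be sketched together as a small Python class. This is an illustrative reading of the prose above, not a transcription of Table 1: the class and method names, the injected `fetch` callable, the tab-separated directory file format, and the use of "~" as the delete-all value (copied from the text as printed) are all assumptions.

```python
import re
from html import escape


class AnnotationProxy:
    ALL = "~"  # assumption: DocURL value meaning "every installed directory"

    def __init__(self, fetch):
        self._fetch = fetch        # callable: URL -> document text (hypothetical)
        self._directories = {}     # directory URL -> list of (xref_url, pattern)

    def install_directory(self, doc_url):
        """Retrieve a directory and add it to the installed set.

        Assumed file format: one 'xref_url<TAB>pattern' entry per line.
        """
        entries = []
        for line in self._fetch(doc_url).splitlines():
            if "\t" in line:
                xref_url, pattern = line.split("\t", 1)
                entries.append((xref_url.strip(), pattern.strip()))
        self._directories[doc_url] = entries

    def uninstall_directory(self, doc_url):
        """Delete one installed directory, or all of them for the ALL value."""
        if doc_url == self.ALL:
            self._directories.clear()
        else:
            self._directories.pop(doc_url, None)

    def request_and_merge(self, doc_url):
        """Fetch the document and link every match from every installed directory."""
        text = self._fetch(doc_url)
        entries = [e for d in self._directories.values() for e in d if e[1]]
        if not entries:
            return text
        combined = re.compile(
            "|".join(f"({re.escape(p)})" for _, p in entries), re.IGNORECASE
        )

        def link(match):
            url = entries[match.lastindex - 1][0]
            return f'<a href="{escape(url, quote=True)}">{match.group(0)}</a>'

        return combined.sub(link, text)


# In-memory stand-in for the network (assumption for the example).
pages = {
    "http://example.com/dir": "http://example.com/proxy\tannotation proxy\n",
    "http://example.com/doc": "An annotation proxy merges documents.",
}
proxy = AnnotationProxy(pages.__getitem__)
proxy.install_directory("http://example.com/dir")
merged_doc = proxy.request_and_merge("http://example.com/doc")
proxy.uninstall_directory(AnnotationProxy.ALL)
plain_doc = proxy.request_and_merge("http://example.com/doc")
```

The value returned by `request_and_merge` is what would be relayed to the requester, whether that is the client or another proxy.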

Cross-reference directories may originate or be provided by various entities. For example, cross-reference diction- 
aries may be prepared by information service providers, educational institutions, publishers, good Samaritans, and the 
like for use by a variety of users. Such predefined cross-reference directories are at known URLs. Cross-reference 
directories may also be generated by the client or a workgroup associated with the client for non-public or other
controlled use with his or her own documents.

Cross reference directories 112 prepared by the client include at least two types. A first type of dictionary, referred
to here as a "frequency of occurrence directory," may be maintained in a manner that automatically keeps track of the
most frequently referenced Web pages and the key words associated with their hypertext links. In a second type of
dictionary, referred to here as a "user maintainable directory," the directory may be maintained in a manner such that the
Web browser includes a link to an optional directory generator 116 that allows the client/user to modify the dictionary
112 by, for example, instructing the directory generator 116 via the Web browser 110 to "add a reference to this
particular document to my personal cross-reference directory", or by editing the match pattern criteria if the user does not like
the default matching pattern provided in an existing annotation directory. Aspects of the two user generated dictionaries
may be combined and either or both may be used in combination with predefined dictionaries created or maintained by

In another embodiment of the invention, the cross-reference directories 112 may be self-generating, and are
referred to here as "self-generating directories." In such a self-generating cross-reference directory 112, a directory
generator 116 is provided on or in association with a document provider, web information server 104, client computer
102, annotation proxy server 118, or any other location on network 100 through which documents pass and could be
read to construct a cross-reference directory.

In simplest terms, directory generator 116 "reads" documents and identifies, statistically analyzes, and stores the
links between particular terms present in the document and cross-linked references within that document, and/or
between one document source and another document source generally. The cross-reference dictionary 112, 191, 192
is built up and improved over time as the number of documents read and contributing to the directory increases. Various
rules are advantageously implemented in the directory generator 116 to provide predictability to the automatically
generated dictionary.
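
One way such a directory generator might work is sketched below in Python: it reads HTML documents, records each pairing of hypertext link text with its target, and keeps a running count so that frequently seen pairings can be promoted to directory entries. The class names, the use of link text as the match pattern, and the `min_count` threshold are assumptions of this sketch; the statistical rules actually applied are not specified here.

```python
from collections import Counter
from html.parser import HTMLParser


class _LinkCollector(HTMLParser):
    """Collects (link text, href) pairs from the anchors in one document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            if text:
                self.links.append((text.lower(), self._href))
            self._href = None


class DirectoryGenerator:
    def __init__(self):
        self.counts = Counter()  # (link text, target URL) -> times seen

    def read(self, html_text):
        """Record every link in one document passing through the generator."""
        collector = _LinkCollector()
        collector.feed(html_text)
        collector.close()
        self.counts.update(collector.links)

    def directory(self, min_count=1):
        """Return (xref_url, match pattern) entries seen at least `min_count` times."""
        return [
            (url, text)
            for (text, url), seen in self.counts.most_common()
            if seen >= min_count
        ]


generator = DirectoryGenerator()
generator.read('<p>See <a href="http://example.com/dsp">signal processing</a>.</p>')
generator.read('<p>More on <a href="http://example.com/dsp">Signal Processing</a>.</p>')
generator.read('<p>Try <a href="http://example.com/midi">MIDI</a>.</p>')
frequent = generator.directory(min_count=2)
```

Raising `min_count` trades completeness for predictability: fewer entries, but each backed by repeated observations.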

In the embodiment of the invention illustrated in FIG. 1, the directory generator 116 is shown in association with the
client computer 102. This may be the preferred location for constructing a personal user annotation directory because
the annotations and cross references are derived from documents requested by the particular user and the cross
references are expected to be relevant to the user's interests. On the other hand, a directory generator residing elsewhere
on the network 100 that sees a large number of documents is better positioned to construct a very complete and
hierarchically deep annotation directory. Such a directory may be somewhat disadvantageous because of its potential size,
and may include cross references that are somewhat irrelevant to a client computer's needs.

In the preferred embodiment that includes the dictionary generator 116, the "match pattern" for each cross
reference item 191, 192 in the automatically generated dictionary is the text for the hyperlink used to request the document.
Alternately, the match pattern in the dictionary may be the text for the hypertext link plus a predefined amount of the
preceding text (e.g., the preceding text going back to the beginning of the sentence or document section, but not more
than X words). Furthermore, the document merger procedure 122 in this embodiment inserts annotations even when
Claims

1. In a distributed computer system incorporating a plurality of servers used to store documents, each said document
having a unique document identifier, and a client computer having a browser configured to request and receive said 
documents over said distributed computer system, an annotation system for automatically adding to a requested 
document cross references to other documents, said annotation system comprising: 

at least one directory of cross references to documents, each cross referenced document having a unique
source identifier; and 

an annotation proxy configured to form a merged document by merging said requested document from a first 
server with annotations comprising cross references to documents referenced by said at least one directory 
and to relay said merged document to a receiver selected from another proxy or said browser. 

2. The system of claim 1, wherein said annotations are hypertext links defined using hypertext mark up language
(HTML). 

3. The system of claim 1, wherein said annotations are hypertext links, and said directory of cross references to
documents includes entries, each entry comprising:

a document identifier specifying a document; and 

a pattern, indicating criteria for inserting said document identifier into said set requested document when cre- 
ating said merged document. 

4. The system of claim 3, wherein at least a subset of said entries each includes a relevance indicator, indicating likely
relevance of said document. 

5. The system of claim 1, wherein said annotation proxy includes instructions for accepting commands from said
client computer identifying a set of directories to use when annotating said requested document, and for forming said
merged document by merging said requested document with annotations comprising cross references to docu- 
ments referenced by said client computer identified set of directories. 

6. A method for automatically adding to a requested document cross references to other documents, said method 
comprising the steps of: 

recognizing a request for a stored document by a client; 

transmitting said requested document to an annotation proxy for annotation; 

providing, in association with said annotation proxy, at least one directory of cross references to documents, 
each cross referenced document having a unique source identifier; 

merging said requested document with annotations comprising cross references to documents referenced by 
said at least one directory; and 

relaying said merged document to a receiver selected from another proxy or said client. 

7. The method of claim 6, wherein said annotations are hypertext links defined using hypertext mark up language
(HTML). 

8. The method of claim 6, wherein said annotations are hypertext links, and said directory of cross references to
documents includes entries, each entry comprising:

a document identifier specifying a document; and 

a pattern, indicating criteria for inserting said document identifier into said set requested document when cre- 
ating said merged document. 

9. The method of claim 8, wherein at least a subset of said entries each includes a relevance indicator, indicating likely
relevance of said document. 

10. The method of claim 6, including accepting commands from said client identifying a set of directories to use when
annotating said requested document, and forming said merged document by merging said requested document 